We are currently looking for a Senior Machine Learning Engineer specializing in speech. In this role, you will be responsible for designing, developing, and optimizing state-of-the-art speech recognition, synthesis, and processing systems. You will also play a key role in developing and implementing evaluation metrics to ensure the accuracy, reliability, and overall quality of speech models and systems.
Responsibilities:
• Design and implement advanced machine learning models for speech recognition (ASR), speech-to-speech, and text-to-speech (TTS) systems.
• Research and evaluate new algorithms, frameworks, and techniques in speech and audio processing.
• Develop efficient workflows for curating, cleaning, and annotating large-scale speech datasets, ensuring data quality and relevance for training and evaluation.
• Preprocess speech data by handling noise reduction, segmentation, feature extraction (e.g., MFCCs, spectrograms), and augmentation techniques to enhance model robustness.
• Develop robust evaluation frameworks and metrics for assessing the performance of speech models.
• Fine-tune and deploy pre-trained open-source models for speech-related tasks, such as ASR, speaker diarization, timestamp alignment, role detection, PII removal, and spoken language detection.
• Collaborate with cross-functional teams, including data engineers, software developers, and product managers, to integrate speech technologies into end-user applications.
• Stay up to date with advancements in speech technology and contribute to the company's technical strategy in this domain.
Requirements:
• Bachelor's, Master's, or Ph.D. in Computer Science, Electrical Engineering, or a related field.
• 5+ years of experience in machine learning, with at least 2 years focused on speech or audio processing.
• Strong understanding of speech processing concepts, including ASR, TTS, and acoustic modeling.
• Proficiency in Python and experience with PyTorch.
• Knowledge of deep learning architectures such as CNNs and transformers.
• Solid knowledge of signal processing techniques and feature extraction methods (e.g., MFCCs, spectrograms).
• Strong understanding of speech quality evaluation techniques and metrics.
• Familiarity with cloud platforms (e.g., AWS, GCP, Azure) and ML operations (MLOps) best practices.
• Strong problem-solving skills and the ability to work independently on complex projects.
• Hands-on experience with open-source speech models and datasets (a plus).
• Experience with speech applications in healthcare (a plus).
All qualified applicants will receive consideration for employment without regard to race, color, national origin, age, ancestry, religion, sex, sexual orientation, gender identity, gender expression, marital status, disability, medical condition, genetic information, pregnancy, or military or veteran status. We consider all qualified applicants, including those with criminal histories, in a manner consistent with state and local laws, including the California Fair Chance Act, City of Los Angeles' Fair Chance Initiative for Hiring Ordinance, and Los Angeles County Fair Chance Ordinance. For unincorporated Los Angeles County, to the extent our customers require a background check for certain positions, the Company faces a significant risk to its business operations and business reputation unless a review of criminal history is conducted for those specific job positions.